Risk-Averse Control of Undiscounted Transient Markov Models

Authors

  • Özlem Çavus
  • Andrzej Ruszczynski
Abstract

We use Markov risk measures to formulate a risk-averse version of the undiscounted total cost problem for a transient controlled Markov process. We derive risk-averse dynamic programming equations and show that, when risk measures are employed, a randomized policy may be strictly better than deterministic policies. We illustrate the results on an optimal stopping problem and an organ transplant problem.

Similar resources

On terminating Markov decision processes with a risk-averse objective function

We consider a class of undiscounted terminating Markov decision processes with a risk-averse exponential objective function and compact constraint sets. After assuming the existence of an absorbing cost-free terminal state , positive transition costs away from , and continuity of the transition probability and cost functions, we establish (i) the existence of a real-valued optimal cost function...

Second Order Optimality in Transient and Discounted Markov Decision Chains

The article is devoted to second order optimality in Markov decision processes. Attention is primarily focused on the reward variance for discounted models and undiscounted transient models (i.e. where the spectral radius of the transition probability matrix is less than unity). Considering the second order optimality criteria means that in the class of policies maximizing (or minimiz...

Risk-averse dynamic programming for Markov decision processes

We introduce the concept of a Markov risk measure and we use it to formulate risk-averse control problems for two Markov decision models: a finite horizon model and a discounted infinite horizon model. For both models we derive risk-averse dynamic programming equations and a value iteration method. For the infinite horizon problem we also develop a risk-averse policy iteration method and we pro...

Risk premiums and certainty equivalents of loss-averse newsvendors of bounded utility

Loss-averse behavior makes the newsvendors avoid the losses more than seeking the probable gains as the losses have more psychological impact on the newsvendor than the gains. In economics and decision theory, the classical newsvendor models treat losses and gains equally likely, by disregarding the expected utility when the newsvendor is loss-averse. Moreover, the use of unbounded utility to m...

Journal:
  • SIAM J. Control and Optimization

Volume 52  Issue 

Pages  -

Publication date: 2014